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CN . Abstract 

We investigate the problem of reconstructing sparse multivariate trigonometric polyno- 
■^>( , mials from few randomly taken samples by Basis Pursuit and greedy algorithms such as 

^^ ■ Orthogonal Matching Pursuit (OMP) and Thresholding. While recovery by Basis Pursuit 

(-H i has recently been studied by several authors, we provide theoretical results on the success 

probability of reconstruction via Thresholding and OMP for both a continuous and a dis- 
crete probability model for the sampling points. We present numerical experiments, which 
indicate that usually Basis Pursuit is significantly slower than greedy algorithms, while the 
recovery rates are very similar. 
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r^ : 1 Introduction 

H ! In the last two years the rapidly growing field of compressed sensing has attracted much 

attention [1, 8, 9, 12, 13, 14, 18, 33, 38]. Its basic idea is that sparse or compressible signals 
can be reconstructed from vastly incomplete non-adaptive information. By "sparse" we mean 
that a vector has only few non-zero coefficients, while "compressible" expresses that a vector 

C^ '_ can be well-approximated by a sparse one. 

Previous work on this topic includes the reconstruction of Fourier coefficients from sam- 
ples taken randomly on a lattice by the Basis Pursuit (BP) principle [8, 12]. This consists 
in minimizing the £^-norm of the Fourier coefficients subject to the condition that the cor- 
responding trigonometric polynomial matches the sampling points. Indeed, it was proven by 
Candes, Romberg and Tao in [8] in the setting of the discrete Fourier transform that this 
scheme recovers the coefficients exactly with high probability provided the number of samples 
is high enough compared to the sparsity, i.e., the number of non-zero coefficients. This result 
has been generalized by the second author of the present paper in [33] for the case of samples 
taken uniformly at random from the cube [0, In]'^. 

Another line of research suggests greedy methods such as (Orthogonal) Matching Pursuit 
(OMP) and Thresholding for sparse reconstruction tasks [28, 18, 41, 42]. OMP and Threshold- 
ing are conceptually simple to implement and potentially faster than BP. In particular, they 



may easily take into account fast algorithms for multiplication with the involved matrices, 
while most standard software for convex optimization [6, 29] does not allow this. 

This paper is devoted to the theoretical and numerical investigation and comparison of 
Thresholding, OMP and BP for the recovery of sparse trigonometric polynomials from ran- 
domly taken samples. Our theoretical results indicate that indeed all the methods are suitable 
for this task. The novelty in the present paper is a performance analysis for OMP and Thresh- 
olding, although the theoretical achievements for OMP are only partial so far. In contrast to 
BP, the greedy algorithms give only a non-uniform guarantee of recovery at a sufficiently small 
ratio of the number of samples to the sparsity. This means that they guarantee recovery with 
high probability only for the given trigonometric polynomial, while BP can actually guarantee 
recovery of all sufficiently sparse trigonometric polynomials from a single sampling set. 

In practice however, a non-uniform guarantee might be sufficient. Indeed, our numerical 
experiments suggest that OMP even slightly outperforms BP on generic signals with respect 
to reconstruction rate. Considering that greedy algorithms are usually significantly faster than 
BP one would probably use OMP for most applications despite its lack of uniformity. 

For related work on this topic, also known as compressed sensing, we refer to [1, 8, 12, 9, 
10, 11, 14, 15, 16, 17, 18, 38] and the references therein. For more information on sampling 
of (not necessarily sparse) trigonometric polynomials in a probabilistic setting the reader may 
consult [2, 4, 24]. 
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(a) Sparse coefficient vector. 
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(b) Trigonometric polynomial and a few samples. 



Figure 1: Sparse vector of Fourier coefficients and the corresponding trigonometric polynomial 
(real part). After sampling at a few randomly chosen points. Orthogonal Matching Pursuit 
(OMP), i.e.. Algorithm 1, as well as the Basis Pursuit principle recover the coefficient vector 
perfectly with high probability. 



The paper is organized as follows: After introducing the necessary notation, including the 
Orthogonal Matching Pursuit algorithm, we first review known results for Basis Pursuit. Then 
we present our main result concerning Thresholding and OMP. Based on the coherence param- 
eter we also provide uniform reconstruction results for Thresholding and OMP, cf. Subsection 
2.4. 

In Section 3 all proofs of the obtained results are given. Section 4 presents extensive 
numerical experiments. Finally, Section 5 makes conclusions and discusses possible future 
work. 



2 Main Results 

2.1 The Setting 

For some finite subset F C Z , d G N, we let Ilr denote the space of all trigonometric polyno- 
mials in dimension d whose coefficients are supported on F. Clearly, an element / of Ilr is of 
the form 

fix) = J2cke^''-\ xG[0,27r]^ 
feer 

with some Fourier coefficients Ck G C. The dimension of Hr will be denoted by D := |F|. 
Taking F = {—q, —q + l,..., q}'^ yields the space Ilr = n^ of all trigonometric polynomials of 
maximal order q. 

We will mainly deal with "sparse" trigonometric polynomials, i.e., we assume that the 
sequence of coefficients c^ is supported only on a set T, which is much smaller than F. However, 
a priori nothing is known about T except for its maximum size. Thus, it is useful to introduce 
the set nr(M) C Ilr of all trigonometric polynomials whose Fourier coefficients are supported 
on a set T C F satisfying \T\ < M. Note that 

nr(M) = [J Ut 

Tcr,\T\<M 

is not a linear space. 

Our aim is to sample a trigonometric polynomial / of Ilr{M) at N points xi, . . . ,xn G 
[0,2tt]'^ and try to reconstruct / from these samples. If for some tti G N the sampling points 

are located on the grid 

27r . r„ 27r 2Tr(m - 1)^ 
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m I m m 

then this problem can also be interpreted as reconstructing a sparse vector from partial infor- 
mation on its discrete Fourier transform. 

Basis Pursuit consists in solving the following ^^-minimization problem 

min||d||i:=^|4| subject to ^ 4e'"*"^ = / (a;,) , i = l,...,iV. (2.1) 

feer fcer 

This task can be performed with convex optimization techniques [6]. For real- valued coefficients 
(2.1) can be reformulated as a linear program while for complex-valued coefficients we obtain 
a second order cone program. For both kind of problems standard software exists, such as 
MOSEK [29] or CVX [21] (internally using SeDuMi [40]) and since recently also LIMAGIC 
[36] (only for real- valued coefficients). 

As an alternative to BP we also use greedy algorithms to recover the Fourier coefficients 
of / from few samples. In particular, we study Orthogonal Matching Pursuit (Algorithm 1) 
as well as the very simple Thresholding algorithm (Algorithm 2). We need to introduce some 
notation. Let X = {xi, . . . ,XAr} be the set of (random) sampling points. We denote by Tx 
the N X D matrix (recall that D = |F|) with entries 

(^x)i,fc = e^'■"^ l<j<N,k€T. (2.2) 



Algorithm 1 OMP 



Input: sampling set X C [0, 27r]^, sampling vector f := {f{xj))f^i, set T C Z'^. 

Optional: maximum allowed sparsity M or residual tolerance e. 



Set s = 0, the residual vector ro = f, and the index set Tq = 0. 
repeat 

Set s = s + 1. 

Find kg = argmax^er K^'s-i, 0fc)| and augment T^ = Tg-i U {ks}. 

Project onto spanj^fc, k £ Tg} by solving the least squares problem 

W^TsXds -f||2 -^ min. 

Compute the new residual r^ = f — J-x^xds- 
until s = M or ||rs|| < e 
Set T = Tg, the non-zeros of the vector c are given by {ck)k<^T = dg- 



Output: vector of coefficients c and its support T. 



Then clearly, f{xj) = {J-xc)j if c is the vector of Fourier coefficients of /. Let (j)^ denote the 
fe-th column of J^x-, i-e., 




so Tx = {(j)kMk2\---\(t>ko)- By 

(^Tx),,fe = e^^■"^ 1 < i < iV, fc G T. (2.3) 

we denote the restriction of J-x to sequences supported only on T. Furthermore, let (•, •) be 
the usual Euclidean scalar product and || • ||2 the associated norm. We have H^fclb = vN for all 
/c G r, i.e., all the columns of !Fx have the same £^-norm. We postpone a detailed discussion 
on the implementation of Algorithm 1 to Section 4. 

Of course, the hope is that running OMP or BP on samples of some / G nr(M) will 
recover its Fourier coefficients. To analyze the performance of the algorithms we will use two 
probabilistic models for the sampling points (one for the continuous case and the other one for 
the discrete Fourier transform case). This random modeling can be understood in the sense 
that the sampling set X is 'generic': reconstruction is allowed to fail for certain choices of X, 
as long as the probability of encountering such a pathological case is very small. 

We will work with the following two probability models for the sampling points: 

(1) We assume that the sampling points xi, . . . , xx are independent random variables having 
the uniform distribution on [0, 27r] . Obviously, the cardinality of the sampling set X = 
{xi, . . . , xat} equals the number of samples N with probability 1. 

(2) We suppose that the sampling points xi, . . . ^xx have the uniform distribution on the 
finite set — 2^m fo'^ some mGN\{l}. It will then always be assumed implicitly that 



Algorithm 2 Thresholding 



Input: sampling set X C [0, 27r] , sampling vector f := (/(a^j)):^i, set F C Z , 
maximum allowed sparsity M 

1: Find the indices T C F corresponding to the M largest inner products {|(f, (j)k)\}k£T- 
2: Project onto span{(j)k, k £ Tg} by solving the least squares problem 

W^Txd — f II2 -^ niin . 
3: The non-zero entries of the vector c are given by {ck)keT = d. 

Output: vector of coefficients c and its support T 

F C Z^. Clearly, our second model aims at studying the problem of reconstructing sparse 
vectors from (partial) information on its discrete Fourier transform. 

Observe that it happens with non-zero probability that some of the sampling points 
coincide, so the cardinality of the sampling set X = {xi, . . . , xn} might be smaller than 
A'^. However, for small N <^ D this effect will occur rather rarely, and also does not harm 
our theoretical analysis. 

We will often refer to the first model as the "continuous model" while the second will be 
called the "discrete model" . It turns out that one can treat both probability models in parallel. 

2.2 Previous results for Basis Pursuit 

Based on ideas due to Candes, Romberg and Tao in [8], Rauhut has shown the following 
result concerning recovery of sparse trigonometric polynomials in [33, Theorem 2.1 and Section 
3.6(b)]. 

Theorem 2.1. LetT C Z'^ finite andT C F. Set M = \T\ and D = |F|. Let X = {xi,...,xn) 
be chosen according to one of our two probability models. If for some e > it holds 

N > CM\og{D/e) 

then with probability at least 1 — e every trigonometric polynomial f G nr(M) with Fourier 
coefficients supported on T can be recovered from its sample values f{xj),j = 1,...,N, via 
Basis Pursuit. The constant C is absolute. 

In particular, if the sparsity M is small and the dimension D large then we may choose 
the number A^ of samples much smaller than D (but larger than M), and we are still able to 
recover a polynomial / G IIy{M) exactly - at least with high probability. 

Note that the theorem is non-uniform in the sense that it does not guarantee that a single 
sampling set {xi, . . . , xn} is good for all support sets T of a certain cardinality M. This slight 
drawback can actually be removed at the cost of introducing additional log-factors by using 
the concept of the uniform uncertainty principle introduced in [9, 12]. For an iV x D matrix A 



and M < D the restricted isometry constant 5m is defined as the smallest number such that 
for all subsets T C {1, . . . , D} with cardinality \T\ < M 

{1-5m)\\x\\1<\\Atx\\1<{1 + 5m)\\x\\1 

for all coefficient vectors x supported on T. Here At denotes the submatrix of A consisting of 
the columns indexed by T. In [9] the following general recovery theorem for BP was proved. 

Theorem 2.2. Assume that A is a matrix satisfying 

5m + 52M + 53M < 1. (2.4) 

Then every vector x with at most M non-zero entries can be recovered from y = Ax by BP. 

The above statement is deterministic, but unfortunately, it is hard to check that a deter- 
ministic matrix A has small enough (5m 's- So the strategy is to prove that a random matrix, in 
particular, our matrix J^x, satisfies condition (2.4). Indeed, this was first achieved by Candes 
and Tao in [12] for partial discrete Fourier matrices J-'x, and later improved by Rudelson 
and Vershynin in [38]. Recently, Rauhut [34] used Rudelson and Vershynin's method to come 
up with an analog result for J^x with samples X drawn from the continuous distribution on 
[0,27r]°'. More precisely, if 

iV/log(iV) > CsMlog'^{M)log{D)log^{e-^) (2.5) 

then with probability at least 1 — e the restricted isometry constant of the N x D matrix 
N^^I'^Tx satisfies bu ^ <5, both for the continuous and discrete probability model, see [34, 
Theorem 3.2], [38]. The above condition is particularly good for small M, but is always satisfied 
if A^ > CM log (D) log (e~^). (It seems that the log (e^^)-factor can be improved to log(e~"^), 
see [39]). We note that Candes and Tao obtained the condition A^ > CMlog (L>)log(e~^) 
(substitute e = cD~P''^ in [12, Definition 1.12] of the uniform uncertainty principle and use 
[12, Lemma 4.3]). 

Combining Theorem 2.2 and condition (2.5) gives a uniform result for recovery of sparse 
trigonometric polynomials by BP, see also [12, 34, 38]. Thus, a single sampling set X may be 
good for all / G nr(M), in particular for all support sets T. 

Note that in [9] an extension of Theorem 2.2 to noisy and non-sparse situations is provided. 

2.3 Recovery results for greedy algorithms 

Let us first consider recovery by thresholding. Our theorem reads as follows. 

Theorem 2.3. Let f £ Il-p{M) with Fourier coefficients supported on T with \T\ = M . Define 
its dynamic range by 

_ maxfcgTicfcl 

miufcerjcfcl' 

Choose random sampling points X = (xi, . . . ,xn) according to one of our two probability 
models. If for some e > 

iV > CMi?2 iog(4D/e) (2.6) 

then with probability at least 1 — e thresholding recovers the correct support T and, hence, also 
the coefficients c. The constant C is no larger than 17.89. 



Hence, also with thresholding we can recover a sparse trigonometric polynomials from 
few random samples. Again the number of samples N is linear in the sparsity. However, 
compared to Theorem 2.1 for Basis Pursuit there is an additional dependence on the dynamic 
range R. Moreover, the above theorem is non-uniform in contrast to the available results for 
Basis Pursuit. This means that recovery is successful with high probability only for the given 
support set T and coefficient vector c. It does not guarantee that a single sampling set X is 
good for recovering all sparse trigonometric polynomials. Indeed, if A^ < CM'^ it can be shown 
that given the random sampling set X there exists with high probability a coefficient vector 
c = c{X) supported on T for which thresholding fails [35] (compare also the next section). 

Let us now consider Orthogonal Matching Pursuit. One can expect that this iterative strat- 
egy performs better than the simple Thresholding algorithm. In particular, the dependence 
of the number of samples N on the dynamic range R as in (2.6) should be removed. This is 
indeed indicated by the next theorem. 

Theorem 2.4. Let f G nr(M) with coefficients supported on T. Choose random sampling 
points X = (xi, . . . ,xn) according to one of our two probability models. If 

N> CMlog{8D/e) 

then with probability at least 1 — e OMP selects an element of the true support T in the first 
iteration. The constant C is no larger than 32.62. 

Our numerical experiments indicate that this result should extend to all iterations so that 
finally the true sparse polynomial / is recovered. Unfortunately, we have not yet been able to 
analyze theoretically the further iterations of OMP because of subtle stochastic dependency 
issues. 

Again the above theorem is non-uniform and we refer to the next section for a uniform 
result, which, however, requires much more samples. Similarly as for thresholding, one can 
show that if A^ < CM^'"^ then with high probability there exists an M -sparse coefficient vector c 
depending on the sampling set such that OMP fails in the first step [35]. We remark, however, 
that if OMP selects a wrong element in some step it might nevertheless recover the right 
polynomial. Indeed, if after M' > M steps the true support T is contained in the recovered 
support T' then the coefficients in T' \ T will usually be set to 0, since with high probability 
J-T'x is injective (see also Lemma 3.2 in [33]). A precise analysis when this situation occurs 
seems to be quite involved, however. 

2.4 Uniform results based on coherence 

Many results for OMP rely on the so called coherence parameter ^ [41, 42]. It measures the 
correlation of different columns of the measurement matrix Tx- It requires that they have 
unit norm, so we define (pk '■= A^~^ 0fc resulting in \\(f)k\\2 = 1- Then fi is defined as 

H := max\{(j)j,(j)k)\ = N~'^ ma.x\{(pj,(pk)\- (2.7) 

Reformulating Corollary 3.6 in [41] for our context yields the following result. 

Theorem 2.5. Assume (2M — 1)^ < 1. Then OMP (and also BP) recovers every / G nr(M). 



Proof: The matrix J-tx is injective for all T with \T\ < M due to the condition on ^u, see 
e.g. [42, Proposition 4.3]. This ensures uniqueness of a polynomial / G nr(M) having the 
vector (/(xi), . . . , /(xjv)) of sample values. The rest of the proof is the same as the one of 
Corollary 3.6 in [41]. ■ 

A similar result holds also for thresholding, see e.g. [22]. Again there is an additional 
dependence on the ratio of the largest to the smallest coefficients. 

Theorem 2.6. Assume (2M — 1)// < R~^ for some R> 1. Then thresholding recovers every 
/ G nr(M) whose Fourier coefficients c satisfy 

max^jgg^pp c I Cfc I 

— '■ i — r — 

n^iiifcesuppc |Cfc| 

So clearly, we need to investigate the coherence of the random matrix J^x ■ 

Theorem 2.7. Let X = (xi, . . . , xn) be chosen according to one of our two probability models. 
Suppose that 

N > C{2M - if \og{AD'/e), (2.8) 

where D' := #{j — /c : j, fc G F, j 7^ /c} < D^ . Then with probability at least 1 — e the coherence 
of J^x satisfies 

{2M -l)fi< 1, 

and consequently OMP recovers every M-sparse trigonometric polynomial. The constant sat- 
isfies C < 4 -\ — j= ~ 4.94. In case of the continuous probability model it can be improved to 
C = 4/3. 



Of course, by Theorem 2.6 a similar result applies also to thresholding. Note that always 
D < D' < D^ . If r = Z^„ and X is chosen at random from the grid —'^m then actually 
D' = D — 1 because j — k is computed in the "periodic sense" . 

In contrast to the results of the previous section the above theorem is uniform in the sense 
that a single sampling set X is sufficient to ensure recovery of all sparse signals. However, 
it clearly has the drawback that now the number of samples N scales quadratically in the 
sparsity M. Apart from perhaps the constant C and the logarithmic scaling in the dimension 
D condition (2.8) actually does not seem to be improvable if one requires exact recovery of all 
sparse trigonometric polynomials from a single sampling set X, see the remark after Theorem 
2.4. So in this regard there is a crucial difference between Basis Pursuit and greedy algorithms. 
For certain applications a non-uniform recovery result might be enough and then greedy al- 
gorithms work well (and usually much faster than BP), see also the numerical experiments 
in Section 4. But if one requires uniformity (and speed is not a concern) then Basis Pursuit 
seems to be the method of choice. 

2.5 Related Work 

BP with other measurement ensembles. Other choices of the measurement matrix A 
(instead of Tx) were also investigated by several authors, see e.g. [12, 14, 15, 16, 17, 37, 38]. 
For instance, it was shown (see e.g. [1, 12, 13, 37, 38]) that an N x D random matrix A with 



independent Gaussian distributed entries (with variance N ^) has restricted isometry constant 
5m < S with probabihty at least 1 — e provided 

N >CsM ^og(-^Y (2.9) 

Similar estimates are possible for Bernoulli matrices, i.e., random matrices with independent ±1 
entries. Condition (2.9) is slightly better than (2.5). So for certain applications of compressed 
sensing Gaussian / Bernoulli matrices A might be useful. However, such "completely random" 
matrices have the disadvantage of not being structured, hence in contrast to the Fourier case no 
fast algorithms are available for matrix vector multiplication. Moreover, the "samples" y = Ax 
lack a physical meaning. Of course, this does not matter in encoding / decoding problems. 
However, there are possible applications where the samples result as the output of a physical 
measurement or an image reconstruction problem [8] and can really be interpreted as samples 
of a trigonometric polynomial. Clearly, in such a case one must use the Fourier matrix J^x- 

Orthogonal Matching Pursuit. Concerning OMP (or greedy algorithms in general) in 
connection with compressed sensing much less investigations were done so far. Gilbert and 
Tropp proved the following result [18, Theorem 2] (set e = 2D~p to arrive at the formulation 
below) . 

Theorem 2.8. Let x be an M -sparse vector in M^. Let A be an N x D random matrix with 
independent standard Gaussian entries. Assume that for some e > 

N > 8Mlog{2D/e). 

Then Orthogonal Matching Pursuit reconstructs x from the given data y = Ax with probability 
exceeding 1 — e. 

The method in [18] actually applies to more general types of random matrices A, in partic- 
ular, to Bernoulli matrices. However, it is heavily used that the columns of A are stochastically 
independent, so unfortunately their approach cannot be applied directly to the random Fourier 
matrices J^x- 

The above theorem is non- uniform, i.e., recovery is successful with high probability only for 
a given sparse vector x. It cannot be guaranteed that with high probability a single Gaussian 
measurement matrix A suffices for all M-sparse vectors x. As in the Fourier case Theorem 2.8 
becomes actually false when requiring reconstruction for all coefficients supported on a given 
set T, see e.g. [16, Section 7]. 

Ordinary Matching Pursuit Ordinary Matching Pursuit is slightly simpler than OMP. 
The difference lies in steps 5 and 6 of Algorithm 1. Instead of performing the orthogonal 
projection over the whole space of previously chosen elements (j)k, the new iterate and residual 
are computed as 

ds = ds-i + {rs-.i,4>ks)eks, 
rs = rs-i - {rs-i,(l)ks)4>ks, 

where e^ denotes the fc-th canonical unit vector. This step is clearly faster than the orthogonal 
projection step. However, in contrast to OMP it may now happen that some elements kg are 
selected more than once. So even if MP picks a correct element k £ T in every step it usually 
has not yet reconstructed the right coefficients after M = \T\ steps. However, Gribonval 



and Vandergheynst showed that exponential convergence can be guaranteed under the same 
condition as for OMP in Theorem 2.5, i.e., (2M — l)/i < 1, see [23, Featured Theorems 1 
and 2]. Hence, the uniform recovery result Theorem 2.7 has also an immediate application to 
ordinary Matching Pursuit. Moreover, Theorem 2.4 is applicable as well, since the first step 
of ordinary Matching Pursuit and of Orthogonal Matching Pursuit coincide. 

Sublinear Algorithms for Sparse Fourier Analysis. Finally, we would like to point 
out that in the series of papers [19, 20, 46, 47] an approach from [27] has been considered. 
The algorithms are based on so called isolation and group testing from theoretical computer 
science. It has been proven that one can recover a sparse trigonometric polynomial / G IIt{M), 
r = Zjn, from N randomly chosen samples on -^^m if 

N > CM"" log'^(i:)) log^(l/e) log'^(i?), with some a, /3, -/, 6 e N. 

These results have been generalized to some extend for multivariate trigonometric polynomials 
in [20, 46, 47]. Moreover, [20] indicates that A^ > C'M log^' (D) log'^'(l/e) log^'(i?) samples 
suffice if the random sampling set possesses additional structure. 

The striking point in these algorithms is the fact that the computation time scales "sub- 
linear" as log^(D) in the dimension D. However, numerical experiments in [46, 47] show even 
for small sparsities M a rather large crossover point with the classical FFT which scales as 
Dlog{D). It seems that in practice these algorithms require much more samples than OMP 
and BP, so that the main concern of these methods is speed rather than a minimal number of 
samples. 

3 Proofs 

A main tool for our proofs is Bernstein's inequality, see e.g. [44, Lemma 2.2.9] or [3]. 

Theorem 3.1. Let Yi, i = 1, . . . ,N , be a sequence of independent real-valued random variables 
with mean zero and variance ^\Y^\ < v for all i = 1,. . . ,N. Assume that [1^1 < B almost 
surely. Then 

0,2 



I I Vyj > X I < 2exp (--■ 



2Nv + Bx/Z^ 
Bernstein's inequality allows to prove the following concentration inequality. 

Lemma 3.2. Assume that c is a vector supported on T . Further, assume that the sampling 
set X is chosen according to one of our two probability models. Then for j ^ T and x > it 
holds 



P{\N-'^{J='txc,(I)j)\ >x) <4exp -N 
Proof: Note that 



x^ 



^11^112 + syill^lll^. 



where 



N N 

keT 1=1 e=i 

keT 
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Since the Xi have the uniform distribution either on the grid ^^^ or on the cube [0,27r]'^, 
and since j ^ T it is easy to see that EY^ = 0. Furthermore, the random variables Y^ are 
independent and bounded, 

\Ye\ < ^|cfc| = ||c||i. 
fceT 



Their variance is given by 

E[|y,|2] = I 



Y^ CfcC^e^('=--'')-^^e-*('='-^>^^ 



J] c,cFE[e^(^-'^')-^] = J]|c,|2 = ||c| 

k,k'&T k(^T 



_k,k'eT 

Hereby, we used that E[e*''^^'^ '^^] = 6^^^'- Thus, we obtain 

^{\N-\j^TXC,cPe)\>x) = p|iV-i|^y,| >x] 

< P (iV"^|^Re(Y^)| > x/\/2 I +P (iV"^|^Im(Y£)| > x/\/2 j . 

Applying Bernstein's inequality to Re{Yi) and Im(y£), ^ = 1, . . . , A^, (using that | Re(y£)| < \Yi\ 
and I Im(y£)| < jl^l) we obtain 

^(\N-\Ttxc,H)\ >x) < 4exp I --^^—-^ ^ 1 



2iV||c||i + ^||c||iAxy 
= 4exp(-A J" ^ ]. 

This concludes the proof. ■ 

3.1 Proof of Theorem 2.3 

Thresholding recovers the correct support if 

jdT ■' kfT 

Observe that if Z G T then the triangle inequality yields 

|A"^(0z,J^TXc)| = \ci + A"^((/)i,J^(r\|,})xCr\{i})| 

> mill I9I - max |A-i(</.j, .F(r\{j})xCT\0})l, 

where C'yxw, denotes the vector c restricted to the indices in T \ {/}. Thus, thresholding is 
certainly successful if 

rnax |A""^((/)j, JP"(y\lj})xCT\{j})| <min|cj|/2 and max |Ar~-'-((/)ji., J^txc)| <min|cj|/2. 



11 



We conclude that the probabihty that thresholding is not successful can be upper bounded by 

P (max\N~'^{(t)j,J='^T\{j})xCT\{j})\ > min|cjl/2J + P ( max|iV-i(0fe, J^TXc)| > niin|cj|/2 
< Y.^ (\N-\^j,J^^T\{j})xCT\{j})\ > min|c,|/2) + J] P (|iV-^./.fc,^TXc)| > min|c,|/2) • 

Applying Lemma 3.2 we can bound the probability that thresholding fails by 

.1^1 / ,. min|coP/4 \ 

4|r| exp -N , ' ^' ^„ ^— 

y 4:max\\cT\{j}\\2 + 575 max ||ct\|^-}||i mm |cj|/2y 

+ 4P - |T|) exp (-N , T''^'''^' ^ 



4||c||i + ^||c||imin|cj|/2^ 



,^ / ,^ minlciP 

<4Dexp -A^ ' ^' 



16||c||| + ^||c||imin|cj| 



<4^-pf-^ , ,. "^s''^''' ^ ^1 (3.1) 

\ 16M max |cjp + — ^Mmax \cj\ min |cj| / 



A min | c,- P 1 

373, 



<4Dexp ( --- """', ^' ^^ 1 , (3.2) 

' Mmax c,P 16 + -^ ' 



where min|cj| and max|cj| are always taken over j £ T. We note that in (3.1) the following 
obvious estimates were used, 

llclU < Mmax Icol , and ||c||i <Mmax|ci|. 
II 11^ - .^j. I Ji ' II Hi - .^j, I Ji 

Requiring that the term in (3.2) is less than e and solving for A shows the claim. 

3.2 Proof of Theorem 2.4 

As basic ingredient of the proof we will use that a submatrix J-tx is well-conditioned under 
mild conditions on the number of sampling points A and sparsity M. The corresponding result 
is taken from [24]. It is based on an analysis in [33] of the expectation of the Frobenius norm 
of high powers of NI — J^^-^J^tx, which uses a combinatorial argument based on set partitions 
and is inspired by [8]. 

Theorem 3.3. Let T C Z 6e of size \T\ = M and let xi, . . . ,xn be chosen according to one 
of our two probability models. Choose €,6 £ (0, 1) and assume 



6^N 
3eM 



> log(c(<5)M/e), 



where c{6) = (1 — e ^6'^) ^ < (1 — e ^) ^ ^ 1.582. Then with probability at least 1 — e the 
minimal and maximal eigenvalue of T^-^ J-tx satisfy 

l-SKXminiN-^J'TX^Tx), and \m..{N-^J'TX^Tx)<l + 5. (3.3) 
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Written in a more compact form, if 

N> C6~'^M\og{M/e) 

then (3.3) holds with probabihty at least 1 — e. It seems that also techniques developed by 
Vershynin [45] can be used to provide a version of Theorem 3.3. 

Now let us turn to the proof of Theorem 2.4. (Orthogonal) Matching Pursuit recovers an 
element of the support T in the first step if 

maXj^r|((/)j,J^TXc)| _ max^^y |iV~^(0j, J^txc)| 



m&-Kk(iT\{4>k-,^TXc)\ \\N ^F^x^TXcW 

TX" 



Assume for the moment that we are on the event that the minimal eigenvalue of N ^T^-y-Ttx 



is bounded from below by 1 — (5, cf. Theorem 3.3. Then 

\\N-^T*^x^Txc\U > M-'/^N-^T^x^Txch > M-V2(i _ s)\\ch. 
Hence, we need to bound the probability that 

ma,x\N^^{(t)j,J^TXc)\ < , — llclb. 



Using Lemma 3.2 we can estimate the probability of the complementary event by 

p(max|iV-i((/>j-,.FTxc)| > i^||c||2 ) 

<Y^¥{\N-\ct>,,FTXc)\>^-^\\ch^ 



N (l-<J)'||c||i 



< A{D - M) exp 

< A{D - M) exp 



^4||c||1 + ^||c||i||c||2(1-<5)/Vm^ 
N {l-6f \ 



M4 + ^(l-5) 



In the last step, we applied the Cauchy-Schwarz inequality ||c||i < vAf||c||2. 

Altogether, the probability that OMP fails in the first step can be bounded by 

nKnniN-^J'TxJ'Tx) <l-5)+A{D-M) exp {-^-S^^—\ . (3.4) 

Now choose 5 = 1/2. Then by Theorem 3.3 the first term above can be bounded by e/2 

provided 

N (2 \ 

> log ——M/e . (3.5) 

12eMj - ^ Vl-l/(4e) / 

Further, requiring that also the second term in (3.4) be less than e/2 yields 



N> ( 16 + —^ J Mlog(8(D - M)/e). (3.6) 

Since always M < D there exists a constant C < 32.62 such that 

N> CMlog{8D/e) 
implies that both (3.5) and (3.6) are satisfied. This concludes the proof. 
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Remark 3.4. Clearly, we would like to analyze also the further steps of OMP. However, 
starting with the second step the coefficients c^'^> (as well as the current support set Tg) of the 
current residual r^ depend on the chosen random sampling points. In this situation Lemma 3.2 
cannot he applied directly anymore and the analysis becomes more difficult. 

3.3 Proof of Theorem 2.7 

Let k,j £ T with k / j. Observe that 

N N 



N-\<t>„^k) = N'' Y. e^^'"'^"' = ^'' E ^/' ' ^i 



e=i e=i 



The random variables f/-'' ''^ obey E[y/-'' ''^] = 0, |y/-'' ''^l = 1, and E[|y/^' ^^p] = 1. Hence, 

the union bound and Bernstein's inequahty apphed to Re(y^ ) and Ini(y^ ), see also the 
proof of Lemma 3.2, yields 



N N / _ 2 \ 

fm^axiV-i| ^y/^-^)| > x) < ^ V{N'^\Y,Y^'~^\ > x) < 4Z)' exp -^ . (3.7) 



i¥=k i=l i,fcer i=i \ 3^^" 



Setting X = (2M - 1)"^ yields 

N 



"{{2111 - 1)^> 1) < 4D'exp 



(2M-1P4 + ^, 



Requiring that the latter term be less than e and solving for N shows the claim. 

In case of the continuous probability model (1) it is possible to improve the constant. In 
[32] a sharp moment bound (Khintchine's inequality) for the sum of the variables 1^ is 

provided, i.e., 

N 

E[| J^y/-''~''VP] <p!iVP. (3.8) 

£=l 
Markov's inequality yields for k £ (0, 1), see also [43, Proposition 15], 

N 



N 



exp 



-iv-^iE^/ 



=1 



^ p\ ~ ^ 1 - K ■ 

p=0 '■ p=0 

Using this estimate instead of Bernstein's inequality, the above derivation (3.7) yields 

n(2M-l),>l)<^I>'exp(-^^J^.). 

Choosing for instance k = 3/4 results in the slightly improved condition, cf. (2.8), 

N > -{2M-lflog{AD'/e). 
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If the sampling points are chosen at random from the grid "^l^m then the moment bound 
(3.8) remains vahd only for p = 1,. . . ,m—l. One still expects that also in this case the constant 
C can be improved (possibly depending on m), but the method above is not applicable directly 
any more. 

4 Implementation and Numerical Experiments 

4.1 Analysis of the time complexity 

Orthogonal Matching Pursuit, cf. Algorithm 1, contains two costly computations. Step 4 
multiplies the adjoint measurement matrix T"^ with the current residual vector r^. When 
drawing the sampling set from the lattice -^"^m or from the cube [0,27r] , we use a zero 
padded fast Fourier transform (FFT) or the nonequispaced FFT [31, 25], respectively. In both 
cases, the total costs of this step in one iteration is O(-Dlog-D). Note furthermore, that if the 
maximum in step 4 occurs at several indices the algorithm chooses one of them. 
Step 5 solves in each iteration a least squares problem 

W^^TsXds -f||2 -^ min. 

A straightforward implementation yields costs 0{MN'^) per iteration. We accelerated this step 
by a QR factorization of J-t^x computed from the factorization of Tt^-^x^ cf. [5, pp. 132], 
which reduces the costs to 0{N'^) per iteration Alternatively, we use the iterative algorithm 
LSQR [30] which solves the least squares problem with only 0{MN) floating point operations 
since the matrices Tt^x have uniformly bounded condition numbers, cf. Theorem 3.3. 

Clearly, if OMP succeeds, the algorithm takes M outer iterations. Two reasonable choices 
for stopping criteria are a maximum number of iterations (assuming an upper bound on the 
sparsity M is known) or a residual tolerance e (or a combination of both). In any case the 
algorithm will do no more than A^ iterations. Assuming M outer iterations, a reasonable 
sparsity M = 0{\/D), and N = 0{M\ogD) sampling points as suggested by Theorem 2.4, 
Algorithm 1 (using LSQR) has a total cost of 0{D^'^ log D) arithmetic operations. 

Thresholding, cf. Algorithm 2, avoids the iterative procedure and thus takes 0{D\ogD) 
operations if we assume again M = 0{vD) and N = 0{M\ogD), see also Theorem 2.3. Note 
that the computation of the inner products ^x'^'s becomes the dominant task in both schemes 
\iM = o{y/D). 

4.2 Matlab toolbox for Thresholding, OMP, and BP 

For the reader's convenience, we provide an efficient and reliable implementation of the pre- 
sented algorithms for the univariate case in Matlab. Following the common accepted concept of 
reproducible research, all numerical experiments are included in our publicly available toolbox 
[26]. The toolbox comes with a simple version of the nonequispaced FFT. 

All examples testing the Basis Pursuit principle use the optimization tools of CVX [21], 
LIMAGIC [36], or MOSEK [29], respectively. If the vector of Fourier coefficients is assumed 
to be real valued, the ^^-minimization problem is reformulated as a linear program, whereas 
for complex valued coefficients the corresponding second order cone problem is set up. CVX 
and MOSEK handle both problems but the matrix Tx has to be stored explicitly and no fast 
matrix vector multiplications by FFTs can be employed. In contrast, LIMAGIC includes the 
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use of function handles to avoid this memory bottleneck and reduces the computation time 
from 0{DN) to 0{D\ogD) when multiplying with the matrix Tx- Unfortunately, the solver 
for equality constraint ^^-minimization of this package supports only real valued coefficients. 

4.3 Testbed and examples 

All numerical results were obtained on a Intel PentiumM with 1.6GHz, 512MByte RAM run- 
ning OpenSUSE Linux kernel 2.6.13-15-default and MatLab7.1.0.183 (R14) Service Pack 3. 
Subsequently, we compare our implementation of Algorithm 1 and Algorithm 2 with different 
Basis Pursuit implementations for the univariate case. 

For given dimension D G 2N we set T = {-D/2, -D/2+1, ... , D/2-1}. The support T CT 
of the Fourier coefficients is then chosen uniformly at random among all subsets of F of size M. 
We use two choices of (pseudo-)random Fourier coefficients. If not stated otherwise, the real 
as well as the imaginary part of the supported coefficients is drawn from a normal distribution 
with mean zero and standard deviation one. For some experiments with Thresholding, we 
choose coefficients with modulus one and draw the phase which from the uniform distribution 
on [0,27r]. 

The sampling points Xj are drawn uniformly from the interval [0, 27r] for the continuous 
probability model, denoted by NFFT subsequently. Within the discrete probability model a 
subset of size N is chosen uniformly among all subsets of {0, ^, . . . , "^ j^~ } with size N, 
denoted by FFT. 

Example 4.1. In our first example, we compare the ability of OMP, Thresholding, and BP to 
reconstruct sparse trigonometric polynomials with complex valued coefficients for the dimension 
D = 50. We draw a set T of size M G {1, 2, . . . , 40} and M complex valued Fourier coefficients 
(with modulus one and a uniformly distributed phase in Figure 2(a)). Furthermore, we choose 
N = AQ sampling points within the discrete or the continuous probability model. The samples 
{xj, f{xj)), j = 1,...,N, of the corresponding trigonometric polynomial and the dimension 
D are the input for OMP, Thresholding, and BP. OMP and Thresholding use (updated) QR 
factorizations to solve the least squares problems, BP uses the MOSEK-package [29] to solve 
the second order cone problem. The output dt G C, A; G F, of all algorithms is compared to the 
original vector of Fourier coefficients. Repeating the experiment 100 times for each number M 
of non-zeros, we count how often each algorithm is able to reconstruct the given coefficients. 
Furthermore, the average CPU-time used by each algorithm with respect to the number of non- 
zero coefficients is shown. The same experiment is done for the dimension D = 1000. 

As readily can be seen from Figure 2(d) and 2(f), Thresholding, OMP, and BP differ 
by orders of magnitude in their computation times. Note furthermore, that the number of 
outer iterations and hence the computation time growths linearly with the number of non- 
zeros for OMP. Regarding the rate of successful reconstructions, we observe the following: 
Thresholding fails already for a moderate number of non-zero coefficients, even if we use Fourier 
coefficients with modulus one, i.e., a dynamic range R = 1, cf. Figure 2(a). Basis Pursuit 
and Orthogonal Matching Pursuit achieve much better reconstruction rates, where surprisingly 
OMP outperforms BP for larger dimensions D, cf. Figure 2(e). Moreover, BP shows the same 
behavior for different dynamic ranges, the discrete and the continuous probability model, cf. 
Figure 2(a,b,c). In contrast, OMP seems to take advantage of a larger dynamic range of the 
coefficients in the FFT- situation, cf. Figure 2(b). 
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(a) Success rate vs. sparsity M = 1, . . . , 40 for D - 
100, iV = 40, |cfe| = 1 for keT, FFT. 
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(c) Success rate vs. sparsity M = 1, . . . , 40 for D = 
100, iV = 40, NFFT. 





(b) Success rate vs. sparsity M 
D = 100, iV = 40, FFT. 




(d) Computation time in seconds vs. sparsity M — 
1, . . . , 40 for D = 100, A^ = 40, NFFT. 




(e) Success rate vs. sparsity M = 1, . . . , 40 for D = 
1000, N = 80, FFT. 



(f) Computation time in seconds vs. sparsity M = 
1, . . . , 40 for D = 1000, A^ = 80, FFT. 



Figure 2: Success rate and computation time of Thresholding (dashed), OMP (sohd), and 
BP (dash-dot) with respect to an increasing number M of non-zero Fourier coefficients. The 
number of samples N and the dimension D remain fixed; 100 runs have been conducted for 
each setting. 
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Example 4.2. The purpose of the second experiment is the comparison for real-valued and 
complex-valued coefficients. We consider the Basis Pursuit principle exclusively. As in the 
previous example, we let D = 100 but use only N = 30 sampling points. Again, we repeat the 
experiment 100 times for each level of sparsity. The results in Figure 3 reveal the following: 
Besides the easier implementation and a speed up by 25 percent, the additional assumption to 
have real valued coefficients indeed saves roughly half of the needed samples to recover a sparse 
trigonometric polynomial. 
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(a) Success rate of BP vs. sparsity M — 1, . . . , 40 
for D = 100, iV = 30. 




(b) Computation time of BP vs. sparsity M 
1,...,40 for D = 100, iV = 30. 



Figure 3: Success rate and computation time of Basis Pursuit for complex-valued coefficients 
with FFT (solid) and NFFT (dash-dot) and real-valued coefficients with FFT (dashed) and 
NFFT (dotted), with respect to an increasing number M of non-zero coefficients. The number 
of samples N and the dimension D remain fixed; 100 runs have been conducted for each setting. 

Example 4.3. This example comments on the generalized oversampling factor 9 = N/M 
and its relation to the rate of successful reconstructions. For a fixed dimension D = 1024 
and a fixed generalized oversampling factor 9 > 0, we randomly draw a support set T C T of 
increasing sizes M = 1, . . . , 40, random Fourier coefficients (with modulus one and a uniformly 
distributed phase for Thresholding), and sampling sets in the continuous probability model of 
size N = 9M. For 200 runs of each experiment, we count the number of perfect reconstructions 
by OMP after exactly M steps. As Figure 4(a) reveals, the success rate stays (almost) constant 
or might even increase slightly for an increasing number of non-zero coefficients if the ratio 
9 = N/M remains constant, cf. Theorem 2. 4 and 2.5. 

In the second part of this example, we test Thresholding, OMP, and BP. Our concern is 
the dependence of the ratio 9 = N/M to reach a certain success rate when the dimension D 
varies. For an increasing dimension D = 2^, 2'^, ... , 2^^, we draw support sets T G F of fixed 
sizes M = 4, 8, 16, 32 and random Fourier coefficients (normal distribution for OMP and BP, 
modulus one for Thresholding). We then test for the smallest number N of continuously drawn 
samples, such that at least 90 percent (^180 out of 200 j of the runs result in a perfect recovery 
of the given Fourier coefficients. Figure 4(b-d) confirm the relation 9 = C\log2{D) to reach a 
fixed success rate. In contrast to Thresholding, the constant C^ even decreases mildly for OMP 
and BP when using a larger number M of non-zero coefficients. 
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Both might indicate that the number of necessary samples is N = C M log{D / (M e)) rather 
than N = CM\og{D/e) for OMP and BP in an average case situation. Note furthermore, 
that the Gaussian ensemble indeed allows for reconstruction if this condition (2.9) is fulfilled. 




(a) Success rate of OMP vs. sparsity M = 
1, . . . , 40 for D = 1024 and N = OM with 6 = 2.5 
(solid), ^ = 3 (dashed), and = 3.5 (dash-dot). 
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16384 

(b) Oversampling factor 6 vs. dimension D for 
Thresholding (|cfc| = 1 for fc G T), 90% success. 




16384 



(c) Oversampling factor 6 vs. dimension D for Or- 
thogonal Matching Pursuit, 90% success. 




1024 16384 

(d) Oversampling factor 9 vs. dimension D for Ba- 
sis Pursuit, 90% success. 



Figure 4: Relation between the generalized oversampling factor 9 = N/M and the success rate 
for the continuous probability model. Figures (b)-(d): Oversampling factors 6 necessary to 
reach a success rate of 90 percent with respect to the dimension D and fixed numbers M = 4 
(solid), M = 8 (dashed), M = 16 (dotted), and M = 32 (dash-dot) of non-zero coefficients. 



Example 4.4. This example considers the computation time needed by OMP and BP for an 
increasing dimension D, a dependent number of non-zero coefficients M = 0{vD), and a 
number of samples N = 0{M\ogD). Constants are adjusted such that the used algorithms 
succeed in most cases in the reconstruction task; all methods, except LIMAGIC, are tested 
with complex coefficients. For small dimensions D, we draw the samples continuously and 
test OMP with the updated QR factorization and BP algorithms based on CVX, MOSEK, and 
LIMAGIC. 
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Furthermore, we test with a somewhat larger dimension and discrete drawn samples the 
algorithms: OMP with LSQR and explicitly stored matrices J'TsX, OMP with LSQR and FFT- 
based multiplications with J-'t^Xj BP using LIMAGIC and an explicitly stored matrix Tx for 
D < 2^"^, and BP using LIMAGIC and FFT-based multiplications with Tx- The FFT-based 
multiplications are denoted by implicit in Figure 5 and need no additional memory, whereas 
storing the matrices explicitly needs 0{D log D) bytes for OMP and 0{D^'^ log D) bytes for 
BP. 

Both OMP algorithms show a 0{D^'^ log D) time complexity, whereas the scheme with 
explicit storage of J-t^x is a constant multiple faster. The Basis Pursuit algorithms are con- 
siderably slower in all cases. Moreover, the storage of the whole measurement matrix Tx results 
in large memory requirements. 




1024 

(a) Average computation time (10 runs) vs. di- 
mension D = 2^2^,...,2^ M = Ll^J, 
and iV = M(log2(I>) - 2). Tested algorithms: 
OMP/QR/ update (solid), BP/MOSEK (dashed), 
BP/CVX (dash-dot), BP/LlMAGlC/real/expIicit 
(dotted), aU with NFFT. 
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(b) Computation 

- -,8 ol7 



1048576 
dimension D = 



and N = 2M logj D. 



time vs 

2^2^...,2l^M = y\^\ 

Tested algorithms: OMP/LSQR explicit (solid) 
and implicit (dashed); BP/LlMAGIC/real explicit 
(dash-dot) and implicit (dotted), all with FFT. For 
comparison: lO^^D^'^ logD (solid+diamond). 



Figure 5: Computation time in seconds with respect to an increasing dimension D and depen- 
dent numbers M and A^ of non-zero coefficients and samples, respectively. 

Example 4.5. Of course, stability under noise is important in practice. For Basis Pursuit 
this has been established already by Candes, Romberg and Tao in [9]. For Orthogonal Matching 
Pursuit we performed first numerical experiments in the following way. We corrupted the 
sample values with a significant amount of normal distributed noise. We observed that OMP 
usually finds the correct support set and makes only small errors on the coefficients, see Figure 
6 and also [34]. Thus, it seems that also OMP is stable under noise - at least if the noise level 
is not too high. (Actually, we observed that for moderately higher noise as in our example in 
Figure 6, the reconstructed coefficient vector is significantly different from the original). 

At last, note that both OMP and BP can be used to identify dominant frequencies from 
few samples - even if these frequencies are very high or if two (high) neighboring frequencies 
are present. Numerical tests showed for instance that the FFT based OMP can recover a 
signal consisting of 10 frequencies in {— 2^^, . . . , 2^^ — 1} - with ki = 2^^ — 1 = 262143, 
k2 = 2^^ — 2 = 262142 being two of them - from 60 random samples. 
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(a) Trig, polynomial and (noisy) samples. 
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(b) True and recovered coefficients. 



Figure 6: Trigonometric polynomial (real part) as in Figure 1 and 30 samples (o). The samples 
are disturbed by Gaussian distributed noise with variance 0.2 (x) (resulting here in a PSNR of 
15.6dB). Nevertheless OMP reconstructs the true support set of the coefficients, and the true 
coefficients (o) themselves with small error (x). 

5 Conclusions and future work 

Our theoretical and numerical results indicate that both Basis Pursuit and greedy algorithms 
such as Orthogonal Matching Pursuit and Thresholding are well-suited for the problem of 
recovering sparse trigonometric polynomials from few random samples taken either on a grid 
(FFT) or on the cube (NFFT). 

In practice, OMP obtains much better success rates than Thresholding. Moreover, our 
numerical results indicate that recovery success rates at reasonably small generalized oversam- 
pling rates N/M are very similar on generic signals for OMP and BP -although it is known that 
BP gives a uniform recovery guarantee while greedy algorithms only provide a non-uniform 
guarantee. As Theorem 2.4 just analyzes the first step of OMP it is still open to investigate 
theoretically the subsequent iterations. Due to stochastic dependency issues this does not seem 
to be straightforward, see Remark 3.4. 

Both greedy methods are significantly faster than standard methods for Basis Pursuit. In 
contrast, the series of papers [19, 20, 46, 47] following [27] considers algorithms that allow for 
asymptotic running time proportional to log^{D) for some /?. It would be very interesting to 
compare the two approaches numerically. Moreover, step 4 of Algorithm 1 might be replaced 
by so-called isolation and group testing techniques as used in [46, 47]. 

Typically, signals are not sparse in a strict sense but might still be well-approximated by 
sparse ones. Due to [9], recovery by Basis Pursuit is still possible with small errors provided 
that the restricted isometry constants are small. As was noted in Section 2.2, our random 
Fourier matrices J-x satisfy this condition [12, 38, 34]. 
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